ReadMe for Replication Files

Manuscript Title: Ambition and Conflict in State Legislatures

Authors: Christian Fong and Michael Kistner

Date: October 21, 2025

---

Computing Environment

R version 4.4.0 (2024-04-24 ucrt) -- "Puppy Cup"
Platform: x86_64-w64-mingw32/x64

R Packages Used:
- tidyverse 2.0.0
- stringr 1.5.1
- cowplot 1.1.3
- fixest 0.12.0
- did 2.1.2
- ggrepel 0.9.5
- modelsummary 1.4.5
- scales 1.3.0
- MatchIt 4.7.2
- dplyr 1.1.4
- lfe 3.1.1
- stargazer 5.2.3
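
To confirm that a local installation matches the versions listed above, a check along the following lines may help. This is a minimal sketch and not part of the replication files; the helper name check_versions is hypothetical.

```r
# Package versions as listed in this ReadMe.
required <- c(
  tidyverse = "2.0.0", stringr = "1.5.1", cowplot = "1.1.3",
  fixest = "0.12.0", did = "2.1.2", ggrepel = "0.9.5",
  modelsummary = "1.4.5", scales = "1.3.0", MatchIt = "4.7.2",
  dplyr = "1.1.4", lfe = "3.1.1", stargazer = "5.2.3"
)

# check_versions() is a hypothetical helper: for each package it reports the
# installed version (NA if the package is not installed) and whether that
# version matches the one listed above.
check_versions <- function(required) {
  installed <- vapply(names(required), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  data.frame(
    package   = names(required),
    required  = unname(required),
    installed = unname(installed),
    matches   = !is.na(installed) & unname(installed) == unname(required),
    row.names = NULL
  )
}

print(check_versions(required))
```

A mismatch does not necessarily prevent replication, but it is the first thing to check if results differ.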

---

Data Files

1. House Elections (Jacobson-Algara).csv - CSV format
   - ORIGINAL DATA SOURCE(S): https://dataverse.harvard.edu/dataverse/carlosalgara
   - Contains U.S. House election data with 10,440 observations across 5 variables (year, district, dv, dpres, pwin)
   - Used to calculate partisan lean of congressional districts over time

2. Cook PVI (2020).csv - CSV format
   - ORIGINAL DATA SOURCE(S): https://www.cookpolitical.com/cook-pvi/2021-partisan-voting-index
   - Contains Cook Partisan Voting Index scores for 435 congressional districts in 2020
   - Includes 3 variables: district identifier, Cook PVI score, and year
   - Used to classify districts as competitive or safe in 2020

3. Redistricting Cycles.csv - CSV format
   - ORIGINAL DATA SOURCE(S): Congressional seats and candidacies by state legislators aggregated from DIME (https://data.stanford.edu/dime); state legislative seats from NCSL (https://www.ncsl.org/about-state-legislatures/state-partisan-composition)
   - Contains 198 observations across 7 variables
   - Variables include: state, chamber, redistricting cycle, percent ran for Congress, seat ratio, log ratio, and state-chamber identifier
   - Used for visualizing the relationship between the state legislative-congressional seat ratio and members running for Congress
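
The seat ratio and log ratio variables can be illustrated with a small sketch. It assumes the seat ratio is the number of seats in a state legislative chamber divided by the state's number of congressional seats, and that the log ratio is its natural logarithm. The column names and seat counts below are hypothetical and are not taken from the data file.

```r
# Illustrative values only; not taken from Redistricting Cycles.csv.
chambers <- data.frame(
  state_chamber = c("A-lower", "B-lower"),  # hypothetical identifiers
  leg_seats     = c(100, 400),              # state legislative seats
  cong_seats    = c(10, 2)                  # congressional seats
)

# Assumed definitions: legislative seats per congressional seat, and its
# natural log.
chambers$seat_ratio <- chambers$leg_seats / chambers$cong_seats
chambers$log_ratio  <- log(chambers$seat_ratio)

print(chambers)
```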

4. State Legislative Chamber Panel.csv - CSV format
   - ORIGINAL DATA SOURCE(S): Congressional seats and candidacies by state legislators aggregated from DIME (https://data.stanford.edu/dime); state legislative seats and partisan composition from NCSL (https://www.ncsl.org/about-state-legislatures/state-partisan-composition); state legislator variables from SLERS (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3WZFK9); ideology variables from Shor-McCarty legislator data (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GZJOT3); bipartisan cosponsorship calculated from LegiScan bill sponsorship data (https://legiscan.com/data-exports)
   - Contains 1,075 observations across 20 variables
   - Variables include: state, chamber, election year, percent ran for Congress, polarization measures, seat ratios, party composition, and legislator demographics
   - Panel data of state legislative chambers from 1998-2022 used for chamber-level regression analyses

5. State Legislator Panel.csv - CSV format
   - ORIGINAL DATA SOURCE(S): Congressional seats and candidacies by state legislators from DIME (https://data.stanford.edu/dime); state legislative seats and partisan composition from NCSL (https://www.ncsl.org/about-state-legislatures/state-partisan-composition); state legislator variables from SLERS (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3WZFK9); ideology variables from Shor-McCarty legislator data (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GZJOT3); bipartisan cosponsorship calculated from LegiScan bill sponsorship data (https://legiscan.com/data-exports); district partisanship from the American Ideology Project (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BQKU4M); district overlap calculated from shapefiles from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html)
   - Contains 29,833 observations across 15 variables
   - Variables include: state, chamber, election year, party, legislator identifiers, seniority, viability for Congress, party vote shares, district partisanship, bipartisan collaboration measures, and congressional candidacy indicator
   - Individual-level panel data of state legislators from 2010-2020 used for difference-in-differences analyses

6. Matching Input.csv - CSV format
   - ORIGINAL DATA SOURCE(S): State legislative seats and partisan composition from NCSL (https://www.ncsl.org/about-state-legislatures/state-partisan-composition); state legislator variables from SLERS (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3WZFK9); ideology variables from Shor-McCarty legislator data (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GZJOT3); bipartisan cosponsorship calculated from LegiScan bill sponsorship data (https://legiscan.com/data-exports); district partisanship from the American Ideology Project (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BQKU4M); district overlap calculated from shapefiles from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html)
   - Contains 28,480 observations across 31 variables
   - Variables include: state, chamber, election_year, party, whether the member is in the majority, name, state legislative district, length of service, bipartisan collaboration measures, extremity, measures of viability, gender, district partisanship, position in the legislature, and legislature characteristics
   - Individual-level panel data of state legislators from 2010-2020 used for matching analyses

7. Congressional Candidacies.csv - CSV format
   - ORIGINAL DATA SOURCE(S): Congressional seats and candidacies by state legislators from DIME (https://data.stanford.edu/dime); state legislative seats and partisan composition from NCSL (https://www.ncsl.org/about-state-legislatures/state-partisan-composition); state legislator variables from SLERS (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3WZFK9); ideology variables from Shor-McCarty legislator data (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GZJOT3); bipartisan cosponsorship calculated from LegiScan bill sponsorship data (https://legiscan.com/data-exports); district partisanship from the American Ideology Project (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BQKU4M); district overlap calculated from shapefiles from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html)
   - Contains 1,129 observations across 11 variables
   - Variables include: legislator identifier, whether the legislator ran for Congress in that cycle, bipartisan cosponsorship score, lagged bipartisan cosponsorship score, district partisanship, seniority, congressional district the legislator overlapped with, state legislative chamber in which the legislator served, election cycle of the pertinent election, state in which the legislator served, and whether the legislator is a member of the majority party
   - Individual-level data of state legislators who ran for Congress and copartisans in the same chamber who could have run for the same chamber
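
The observation and variable counts stated above can be used to verify that each file downloaded intact. The following is a minimal sketch; check_dims is a hypothetical helper, the directory argument is wherever the CSV files were saved, and treating the 435 districts in the Cook PVI file as its row count is an assumption.

```r
# Dimensions as stated in this ReadMe (rows = observations, cols = variables).
expected <- data.frame(
  file = c("House Elections (Jacobson-Algara).csv",
           "Cook PVI (2020).csv",
           "Redistricting Cycles.csv",
           "State Legislative Chamber Panel.csv",
           "State Legislator Panel.csv",
           "Matching Input.csv",
           "Congressional Candidacies.csv"),
  rows = c(10440, 435, 198, 1075, 29833, 28480, 1129),
  cols = c(5, 3, 7, 20, 15, 31, 11),
  stringsAsFactors = FALSE
)

# check_dims() is a hypothetical helper. For each file it records whether the
# file was found and whether its dimensions match (NA if the file is missing).
check_dims <- function(expected, dir = ".") {
  paths <- file.path(dir, expected$file)
  expected$found <- file.exists(paths)
  expected$ok <- vapply(seq_len(nrow(expected)), function(i) {
    if (!expected$found[i]) return(NA)
    d <- read.csv(paths[i], check.names = FALSE)
    nrow(d) == expected$rows[i] && ncol(d) == expected$cols[i]
  }, logical(1))
  expected
}

print(check_dims(expected))
```

If a file was saved with an extra row-name column, its column count will be one higher than listed.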

---

Code Files

1. Creating Seat Competition Plot.R
   - Reproduces Figure 1 from the manuscript
   - Calculates and visualizes the number of competitive versus safe U.S. House seats from 1974-2020
   - Competitive seats defined as those with Cook PVI between -5 and +5 percentage points
   - Outputs a line plot showing the decline in competitive districts over time
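
The classification rule above can be sketched as follows. The column names, district identifiers, and scores are hypothetical, and treating the endpoints (-5 and +5) as competitive is an assumption; the script itself may handle the boundary differently.

```r
# Illustrative scores only; not taken from the replication data.
pvi <- data.frame(
  district = c("X-01", "X-02", "X-03", "X-04"),  # hypothetical identifiers
  pvi      = c(-12, -5, 3, 8)                    # assumed: signed PVI in points
)

# Competitive if the score lies between -5 and +5 (endpoints included by
# assumption); safe otherwise.
pvi$type <- ifelse(abs(pvi$pvi) <= 5, "competitive", "safe")

print(table(pvi$type))
```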

2. Chamber Level Analyses.R
   - Reproduces Table 1, Figure 3, and Figure D.1 from the manuscript
   - Analyzes the relationship between the ratio of state legislative to congressional seats and bipartisan collaboration at the chamber level
   - Estimates both reduced-form OLS regressions and two-stage least squares instrumental variable models
   - Table 1: Shows effects on roll call polarization and bipartisan cosponsorship (1999-2020)
   - Figure 3: Displays predicted levels of polarization and cosponsorship at varying seat ratios with example state chambers
   - Figure D.1: Scatterplots showing correlation between logged seat ratio and percent of legislators running for Congress

3. Difference-in-Differences Analyses.R
   - Reproduces Figure 4, Figure J.1, and Table I.1 from the manuscript
   - Implements within-legislator difference-in-differences designs to estimate the causal effect of viability for Congress on bipartisan collaboration
   - Figure 4: Compares three DiD estimation approaches (two-way fixed effects, stacked DiD, and Callaway-Sant'Anna estimator) for estimating the effect of viability on bipartisan collaboration
   - Figure J.1: Robustness checks testing sensitivity to different viability thresholds (any overlap, 25%, 33%, 50% co-partisan representation)
   - Table I.1: Tests whether viability increases the probability of running for Congress (mechanism validation)
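
For readers unfamiliar with the two-way fixed effects specification, the following toy example shows its basic form. It is a sketch in base R on a made-up, noise-free panel, not the code in the script (which may rely on fixest or did from the package list above). The variable names and the built-in effect of 2 are assumptions for illustration.

```r
# Toy panel: 4 legislators observed over 3 periods, no noise.
panel <- expand.grid(id = 1:4, period = 1:3)

# First period in which each legislator is viable (Inf = never viable).
start <- c(2, 2, 3, Inf)
panel$viable <- as.integer(panel$period >= start[panel$id])

# Outcome = legislator effect + period effect + 2 * viable.
panel$y <- 0.5 * panel$id + 0.1 * panel$period + 2 * panel$viable

# Two-way fixed effects: dummies for legislator and period absorb the
# additive effects, leaving the within-legislator effect of viability.
twfe <- lm(y ~ viable + factor(id) + factor(period), data = panel)

print(coef(twfe)["viable"])
```

Because the toy outcome has no noise and a constant effect, the coefficient on viable recovers the built-in value of 2 up to floating-point error. With real staggered timing and heterogeneous effects the estimators can diverge, which is why comparing several of them is informative.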

4. Legislator-Level Analyses.R
   - Reproduces Table 2, Table C.1, Table C.2, Table E.1, Table F.1, Table F.2, Table F.3, Table G.1, Table G.2, Table G.3, and Table H.1 from the manuscript
   - Implements coarsened exact matching analysis and associated robustness checks
   - Table 2: Main matching results
   - Table C.1 and C.2: Heterogeneous treatment effect analysis for swing vs. safe congressional districts
   - Table E.1: Robustness to using OLS instead of matching
   - Table F.1, F.2, and F.3: Robustness to alternative measures of viability with different overlap standards
   - Table H.1: Correlation between bipartisan collaboration and running for Congress
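
The core idea of coarsened exact matching can be sketched in base R: coarsen a covariate into bins, then keep only the bins that contain both treated and control units. This is an illustration on made-up data with arbitrary cut points, not the script's implementation (which may use MatchIt from the package list above); the variable names are hypothetical.

```r
# Illustrative data only.
toy <- data.frame(
  treated   = c(1, 1, 0, 0, 0, 1),
  seniority = c(2, 9, 3, 15, 8, 22)  # years of service
)

# Step 1: coarsen the covariate into bins (cut points chosen arbitrarily).
toy$bin <- cut(toy$seniority, breaks = c(0, 5, 10, 20, 30))

# Step 2: flag bins that contain at least one treated and one control unit.
has_both <- tapply(toy$treated, toy$bin,
                   function(t) any(t == 1) && any(t == 0))

# Step 3: a unit is matched if its bin has both groups.
toy$matched <- has_both[as.character(toy$bin)] %in% TRUE

print(toy)
```

Units in bins with only treated or only control observations are dropped before estimation; the remaining units are compared within bins.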